Generalization error bounds for iterative learning algorithms with bounded updates
This paper explores the generalization characteristics of iterative learning
algorithms with bounded updates for non-convex loss functions, employing
information-theoretic techniques. Our key contribution is a novel bound for the
generalization error of these algorithms with bounded updates, extending beyond
the scope of previous works that only focused on Stochastic Gradient Descent
(SGD). Our approach introduces two main novelties: 1) we reformulate the mutual
information as the uncertainty of updates, providing a new perspective, and 2)
instead of using the chain rule of mutual information, we employ a variance
decomposition technique to decompose information across iterations, allowing
for a simpler surrogate process. We analyze our generalization bound under
various settings and demonstrate improved bounds when the model dimension
increases at the same rate as the number of training data samples. To bridge
the gap between theory and practice, we also examine the previously observed
scaling behavior in large language models. Ultimately, our work takes a further
step toward developing practical generalization theories.
Correntropy Maximization via ADMM - Application to Robust Hyperspectral Unmixing
In hyperspectral images, some spectral bands suffer from low signal-to-noise
ratio due to noisy acquisition and atmospheric effects, thus requiring robust
techniques for the unmixing problem. This paper presents a robust supervised
spectral unmixing approach for hyperspectral images. The robustness is achieved
by writing the unmixing problem as the maximization of the correntropy
criterion subject to the most commonly used constraints. Two unmixing problems
are derived: the first problem considers the fully-constrained unmixing, with
both the non-negativity and sum-to-one constraints, while the second one deals
with non-negativity and sparsity promotion of the abundances. The
corresponding optimization problems are solved efficiently using an alternating
direction method of multipliers (ADMM) approach. Experiments on synthetic and
real hyperspectral images validate the performance of the proposed algorithms
for different scenarios, demonstrating that the correntropy-based unmixing is
robust to outlier bands.
Comment: 23 page
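As a rough illustration of why a correntropy criterion tolerates outlier bands, the sketch below compares the mean squared error with an empirical correntropy of the same residual vector. It is not the paper's ADMM algorithm: the function name `correntropy`, the kernel bandwidth `sigma`, and the toy residuals are assumptions made for this example, and the Gaussian kernel's normalization constant is omitted.

```python
import numpy as np


def correntropy(residuals, sigma=1.0):
    """Empirical correntropy of residuals with an unnormalized Gaussian kernel.

    Hypothetical helper for illustration; sigma is an assumed bandwidth.
    """
    r = np.asarray(residuals, dtype=float)
    return float(np.mean(np.exp(-r**2 / (2.0 * sigma**2))))


# Toy residuals over 10 spectral bands: a perfect fit, then one corrupted band.
clean = np.zeros(10)
corrupted = clean.copy()
corrupted[0] = 100.0  # a single outlier band

mse_clean = float(np.mean(clean**2))
mse_corrupted = float(np.mean(corrupted**2))
c_clean = correntropy(clean)
c_corrupted = correntropy(corrupted)

# The squared error grows with the square of the outlier's magnitude, while
# the outlier's kernel term saturates near zero, so the correntropy changes
# by at most 1/10 here regardless of how large the outlier is.
print(mse_clean, mse_corrupted)
print(c_clean, c_corrupted)
```

Because each band contributes a bounded kernel value, a maximizer of this criterion is pulled far less by a corrupted band than a least-squares fit would be.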
An Extended Result on the Optimal Estimation under Minimum Error Entropy Criterion
The minimum error entropy (MEE) criterion has been successfully used in
fields such as parameter estimation, system identification, and supervised
machine learning. There is in general no explicit expression for the optimal
MEE estimate unless some constraints on the conditional distribution are
imposed. A recent paper has proved that if the conditional density is
conditionally symmetric and unimodal (CSUM), then the optimal MEE estimate
(with Shannon entropy) equals the conditional median. In this study, we extend
this result to the generalized MEE estimation where the optimality criterion is
the Rényi entropy or, equivalently, the \alpha-order information potential (IP).
Comment: 15 pages, no figures, submitted to Entrop
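For reference, the equivalence between the Rényi entropy and the information potential mentioned above can be written out as follows. These are the standard definitions; the notation (error density $p_e$, order $\alpha$) is assumed here and may differ from the paper's.

```latex
% Renyi entropy of order \alpha (\alpha > 0, \alpha \neq 1) of the error e
% with density p_e, written in terms of the information potential V_\alpha:
\[
  H_\alpha(e) = \frac{1}{1-\alpha} \log V_\alpha(e),
  \qquad
  V_\alpha(e) = \int p_e(x)^{\alpha} \, dx .
\]
% Since \log is increasing, minimizing H_\alpha is equivalent to
% maximizing V_\alpha when \alpha > 1 and to minimizing V_\alpha when
% 0 < \alpha < 1. The Shannon entropy is recovered in the limit:
\[
  \lim_{\alpha \to 1} H_\alpha(e) = -\int p_e(x) \log p_e(x) \, dx .
\]
```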